AD-A072 406

UNCLASSIFIED

ANOTHER VERSATILE FAMILY OF PROBABILITY DISTRIBUTIONS

B. W. Schmeiser and R. Lal

OREM-78015


Department of Operations Research
and
Engineering Management
Southern  Methodist  University 
Dallas,  Texas  75275 


December  1978 


This research was supported by the Office of Naval Research,
Code 431, Contract N00014-77-C-0425.




ABSTRACT 


A new four parameter family of probability distributions is
described. Special cases include Bernoulli trials, uniform, power
series, exponential, triangular and Laplace (double exponential)
distributions. Statistical properties, parameter determination
and random variate generation are discussed.

KEY  WORDS 


Family  of  Distributions 


1.  INTRODUCTION


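Considered in this paper is a four parameter family of probability
distributions having density function

$$
f(x) =
\begin{cases}
[1 + (x-\lambda_1)/(\lambda_2\lambda_4)]^{(1-\lambda_3)/\lambda_3} / (\lambda_2\lambda_3) & \text{if } \lambda_1 - \lambda_2\lambda_4 \le x \le \lambda_1 \\
[1 - (x-\lambda_1)/(\lambda_2(1-\lambda_4))]^{(1-\lambda_3)/\lambda_3} / (\lambda_2\lambda_3) & \text{if } \lambda_1 < x \le \lambda_1 + \lambda_2(1-\lambda_4) \\
0 & \text{otherwise}
\end{cases}
\tag{1}
$$

where $-\infty < \lambda_1 < \infty$, $0 < \lambda_2 < \infty$,
$0 < \lambda_3 < \infty$ and $0 \le \lambda_4 \le 1$.

This family has value due to its ability to model a variety of
probabilistic phenomena by varying only the four parameters.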

There is much literature on simple, yet versatile, methods for
modeling random variables. Schmeiser [11], in a survey of
these methods in the context of computer simulation, discusses systems
of distributions (Pearson and Johnson), approximations to the inverse
distribution function (expansion techniques, polynomial regression,
and rectangular approximation), and four parameter distributions (beta,
four parameter gamma, Burr, generalized lambda, and absolute lambda).

The distribution developed in this paper is a four parameter
family. As such, it is appropriate to survey the literature of
four parameter distributions. Stacy [14], Stacy and Mihram [15]
and Harter [4] discuss the four parameter gamma distribution, which
includes both the three parameter Weibull and gamma distributions.
This distribution is useful for modelling life times with range
$[0, \infty)$. However, parameter estimation is not straightforward.
Burr [1, 2] developed a more flexible distribution with the same range
and straightforward random variable generation. Disadvantages are
that both heavy and light tailed distributions are not obtainable,
nor are symmetric distributions. Ramberg and Schmeiser [7, 8]
generalized Tukey's lambda distribution [18] to obtain a family for
all but light tailed distributions, including symmetric
distributions. The exponential distribution is a limiting case and the
normal distribution function is approximated within a tolerance of
.001. (See Joiner and Rosenblatt [5] for a related approximation.)
Random variate generation is straightforward. However, parameter
estimation requires tables such as given in [3] and [10] for matching
moments. Other estimation techniques, such as maximum likelihood
estimation, are not straightforward due to the distribution function
and density function not being expressible in closed form.

Schmeiser  and  Deutsch  [13]  describe  a family  of  distributions 
which  can  be  used  to  obtain  a distribution  having  any  first  four 
moments.  The  exponential  distribution  is  a limiting  case.  Closed 
form  expressions  are  given  to  calculate  parameter  values  to  match 
desired  mode,  percentile  of  the  mode  and  any  other  two  quantiles. 
Moments  may  be  matched  graphically.  (No  tables  exist  to  date.)  Random 
variable  generation  is  closed  form  and  requires  only  one  exponential 
operation. The disadvantage of this distribution is that while any
four independent properties may be specified, the shape of the
distribution is most satisfactory as a quick and dirty technique. The
shortcomings involve truncated tails and a density function value at
the mode which takes on only the values zero, one, and infinity.


The  family  of  distributions  developed  in  the  following  sections 
provides  a model  which  allows  for  closed  form  parameter  determination 
in  some  cases,  has  a finite  density  at  the  mode,  does  not  truncate 
the  tails  and  provides  a model  for  any  mean  and  variance  and  a 
wide  range  of  commonly  used  third  and  fourth  moments. 

2.  PROPERTIES  OF  THE  FAMILY 

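This section develops the properties of the family having
density function given in equation (1). From the density function,
the distribution function is seen to be:

$$
F(x) =
\begin{cases}
0 & \text{if } x < \lambda_1 - \lambda_2\lambda_4 \\
\lambda_4 [1 + (x-\lambda_1)/(\lambda_2\lambda_4)]^{1/\lambda_3} & \text{if } \lambda_1 - \lambda_2\lambda_4 \le x \le \lambda_1 \\
1 - (1-\lambda_4)[1 - (x-\lambda_1)/(\lambda_2(1-\lambda_4))]^{1/\lambda_3} & \text{if } \lambda_1 < x \le \lambda_1 + \lambda_2(1-\lambda_4) \\
1 & \text{if } x > \lambda_1 + \lambda_2(1-\lambda_4)
\end{cases}
$$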


From the density and distribution functions, the role of each of the
four parameters is clear. The parameters $\lambda_1$ and $\lambda_2$
determine location and scaling, respectively. More specifically, the
single mode or anti-mode is at $\lambda_1$ and $\lambda_4$ is the
probability that an observed value is less than or equal to
$\lambda_1$. Thus $\lambda_4$ may be used to determine the degree of
asymmetry, with $\lambda_4 = .5$ yielding symmetric distributions.
Finally, $\lambda_3$ corresponds most closely to the distribution tail
weight. In combination, $\lambda_3$ and $\lambda_4$ determine the third
and fourth standardized moments.
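
As a minimal sketch of how the density of equation (1) and the distribution function F(x) might be evaluated numerically, the following Python functions implement both piecewise definitions. The function names and the argument names `lam1` to `lam4` (standing for the four parameters) are illustrative choices, not part of the original paper.

```python
import math


def pdf(x, lam1, lam2, lam3, lam4):
    """Density of equation (1); lam1..lam4 stand for the four parameters."""
    lower = lam1 - lam2 * lam4
    upper = lam1 + lam2 * (1.0 - lam4)
    if lam4 > 0.0 and lower <= x <= lam1:
        base = 1.0 + (x - lam1) / (lam2 * lam4)
    elif lam4 < 1.0 and lam1 <= x <= upper:
        base = 1.0 - (x - lam1) / (lam2 * (1.0 - lam4))
    else:
        return 0.0
    base = max(base, 0.0)  # guard against rounding just below zero
    exponent = (1.0 - lam3) / lam3
    if base == 0.0 and exponent < 0.0:
        return math.inf  # lam3 > 1: the density is unbounded at an end point
    return base ** exponent / (lam2 * lam3)


def cdf(x, lam1, lam2, lam3, lam4):
    """Distribution function F(x) of the family."""
    lower = lam1 - lam2 * lam4
    upper = lam1 + lam2 * (1.0 - lam4)
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    if x <= lam1:  # only reachable when lam4 > 0
        base = 1.0 + (x - lam1) / (lam2 * lam4)
        return lam4 * max(base, 0.0) ** (1.0 / lam3)
    base = 1.0 - (x - lam1) / (lam2 * (1.0 - lam4))
    return 1.0 - (1.0 - lam4) * max(base, 0.0) ** (1.0 / lam3)
```

Evaluating `cdf` at the mode returns the fourth parameter, which matches its interpretation as the probability of an observation at or below the mode.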


That these parameters yield a wide range of shapes can be seen
by considering some special cases. Bernoulli trials and the Laplace
(double exponential) distribution are symmetric limiting cases. As
$\lambda_3$ approaches infinity with $\lambda_1 = \lambda_4$ and
$\lambda_2 = 1$, the probability that $X = 0$ is $\lambda_4$ and the
probability that $X = 1$ is $1 - \lambda_4$. As $\lambda_3$ approaches
zero, the limiting distribution is the Laplace. Since Bernoulli
trials correspond to the minimum possible fourth moment for a given
third moment and the Laplace has a fourth standardized moment
(kurtosis) of six, it is seen that a wide range of distribution shapes
may be obtained.

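Another important asymmetric limiting case is the exponential.
By letting $\lambda_1 = 0$, $\lambda_2 = \lambda/\lambda_3$ and
$\lambda_4 = 0$, the limiting case as $\lambda_3$ approaches zero is
exponential with mean $\lambda$, as can be seen from

$$
\begin{aligned}
\lim_{\lambda_3 \to 0} F(x) &= \lim_{\lambda_3 \to 0} 1 - [1 - x/(\lambda/\lambda_3)]^{1/\lambda_3}, & 0 \le x < \infty \\
&= 1 - e^{-x/\lambda}, & 0 \le x < \infty
\end{aligned}
$$

which is the distribution function of the exponential distribution
having mean $\lambda$.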
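
The limit above can be checked numerically. The sketch below evaluates the distribution function for the exponential case at decreasing values of the third parameter and compares it with the exponential distribution function. The function name `cdf_exponential_case` and the chosen test point (x = 1, mean 2) are illustrative assumptions only.

```python
import math


def cdf_exponential_case(x, mean, lam3):
    """F(x) with lam1 = 0, lam4 = 0 and lam2 = mean / lam3."""
    lam2 = mean / lam3
    if x <= 0.0:
        return 0.0
    if x >= lam2:
        return 1.0
    return 1.0 - (1.0 - x / lam2) ** (1.0 / lam3)


if __name__ == "__main__":
    target = 1.0 - math.exp(-1.0 / 2.0)  # exponential cdf at x = 1, mean 2
    for lam3 in (0.5, 0.1, 0.01, 0.001):
        print(lam3, cdf_exponential_case(1.0, 2.0, lam3), target)
```

The gap between the two values shrinks as the third parameter decreases, which is the behavior the limit describes.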

Additional special cases are the power series distribution
having density function

$$
f(x) =
\begin{cases}
\lambda x^{\lambda - 1} & 0 \le x \le 1 \\
0 & \text{elsewhere}
\end{cases}
$$

obtained when $\lambda_1 = \lambda_2 = \lambda_4 = 1$ and
$\lambda_3 = 1/\lambda$. The uniform distribution over the interval
$[a,b]$ is obtained by setting $\lambda_1 = a + (b-a)\lambda_4$,
$\lambda_2 = b-a$, $\lambda_3 = 1$ and $\lambda_4$ to any value in the
unit interval $[0,1]$. Both the power series distribution and the
uniform are special cases of the beta distribution.

Some graphs may illustrate the versatility of the family.
Figure 1 shows graphs of the distribution for $\lambda_4 = .5$ for
various values of $\lambda_3$. Changing $\lambda_1$ and/or $\lambda_2$
would change the location and/or scale, but not the shape, of the
distribution. Since $\lambda_4 = .5$, the distribution is symmetric
for all $\lambda_3$. Figure 2 shows graphs corresponding to
$\lambda_4 = .75$. Here 75% of the area under each curve lies to the
left of the mode (anti-mode) $\lambda_1$ for all $\lambda_3$.


FIGURES  1 & 2 ABOUT  HERE 


The generation of random variates from this family is
straightforward via the inverse distribution function

$$
x = F^{-1}(p) =
\begin{cases}
\lambda_1 - \lambda_2\lambda_4 [1 - (p/\lambda_4)^{\lambda_3}] & \text{if } p \le \lambda_4 \\
\lambda_1 + \lambda_2(1-\lambda_4) [1 - ((1-p)/(1-\lambda_4))^{\lambda_3}] & \text{if } p > \lambda_4
\end{cases}
\tag{2}
$$

Insertion of a U(0,1) (pseudo) random variate into the inverse
distribution function generates a value x from the distribution
determined by $\lambda_1$, $\lambda_2$, $\lambda_3$, and $\lambda_4$.
This generation is relatively fast in that it requires only one
exponentiation. In addition, the closed-form inverse distribution
function allows the direct generation of order statistics via the
methods of Lurie and Hartley [6], Schmeiser [12], Ramberg and
Tadikamalla [9] and Schucany [14].
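
The inverse-transform generation described here can be sketched in a few lines of Python. The names `quantile` and `sample`, and the use of the standard library generator `random.Random`, are assumptions made for illustration. The paper itself specifies only equation (2) and a U(0,1) source.

```python
import random


def quantile(p, lam1, lam2, lam3, lam4):
    """Inverse distribution function of equation (2) for 0 <= p <= 1."""
    if lam4 > 0.0 and p <= lam4:
        return lam1 - lam2 * lam4 * (1.0 - (p / lam4) ** lam3)
    # Reached only when p > lam4 (or lam4 == 0), so 1 - lam4 is positive.
    return lam1 + lam2 * (1.0 - lam4) * (1.0 - ((1.0 - p) / (1.0 - lam4)) ** lam3)


def sample(n, lam1, lam2, lam3, lam4, rng=None):
    """Draw n variates by feeding U(0,1) values into the quantile function."""
    rng = rng or random.Random()
    return [quantile(rng.random(), lam1, lam2, lam3, lam4) for _ in range(n)]


if __name__ == "__main__":
    print(sample(5, 7.0, 18.0, 0.2234, 0.278, random.Random(1)))
```

Each variate costs one exponentiation, in line with the remark above. Passing a seeded `random.Random` instance makes a run reproducible.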


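The moments of the distribution are now derived. Letting
$\lambda_1 = 0$,

$$
\begin{aligned}
E\{X^k \mid \lambda_1 = 0\}
&= \int_{-\lambda_2\lambda_4}^{(1-\lambda_4)\lambda_2} x^k f(x)\,dx \\
&= \int_0^{\lambda_4} \{\lambda_2\lambda_4[(p/\lambda_4)^{\lambda_3} - 1]\}^k\,dp
 + \int_{\lambda_4}^{1} \{\lambda_2(1-\lambda_4)[1 - ((1-p)/(1-\lambda_4))^{\lambda_3}]\}^k\,dp \\
&= \lambda_2^k \Big\{ \lambda_4^k \int_0^{\lambda_4} \Big[\sum_{i=0}^{k} (-1)^i \binom{k}{i} (p/\lambda_4)^{\lambda_3(k-i)}\Big] dp
 + (1-\lambda_4)^k \int_{\lambda_4}^{1} \Big[\sum_{i=0}^{k} (-1)^i \binom{k}{i} ((1-p)/(1-\lambda_4))^{\lambda_3 i}\Big] dp \Big\} \\
&= \lambda_2^k \Big\{ \lambda_4^k \sum_{i=0}^{k} (-1)^i \binom{k}{i} (\lambda_4)^{-\lambda_3(k-i)} \int_0^{\lambda_4} p^{\lambda_3(k-i)}\,dp
 + (1-\lambda_4)^k \sum_{i=0}^{k} (-1)^i \binom{k}{i} (1-\lambda_4)^{-\lambda_3 i} \int_{\lambda_4}^{1} (1-p)^{\lambda_3 i}\,dp \Big\} \\
&= \lambda_2^k \Big\{ \lambda_4^{k+1} \sum_{i=0}^{k} \frac{(-1)^i \binom{k}{i}}{\lambda_3(k-i)+1}
 + (1-\lambda_4)^{k+1} \sum_{i=0}^{k} \frac{(-1)^i \binom{k}{i}}{\lambda_3 i+1} \Big\} \\
&= \lambda_2^k \sum_{i=0}^{k} \frac{(-1)^i \binom{k}{i}}{\lambda_3 i+1} \{ (1-\lambda_4)^{k+1} + (-1)^k \lambda_4^{k+1} \} \\
&= (\lambda_2\lambda_3)^k (k!) [ (1-\lambda_4)^{k+1} + (-1)^k \lambda_4^{k+1} ] \Big/ \prod_{i=1}^{k} (\lambda_3 i+1)
\end{aligned}
$$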


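For any value of $\lambda_1$, some algebraic reduction yields

$$
E\{X\} = \lambda_1 + [\lambda_2\lambda_3(1-2\lambda_4)]/(\lambda_3+1)
$$

and

$$
V\{X\} = (\lambda_2\lambda_3)^2 [2\lambda_4(1-\lambda_4)(\lambda_3-1) + 1] / [(2\lambda_3+1)(\lambda_3+1)^2]
$$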
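
The mean and variance expressions can be checked against the general moment result. The sketch below codes the kth moment about the mode, $(\lambda_2\lambda_3)^k\,k!\,[(1-\lambda_4)^{k+1} + (-1)^k\lambda_4^{k+1}] / \prod_{i=1}^{k}(\lambda_3 i + 1)$, as read from the derivation above, together with the two reduced formulas. The function names are illustrative assumptions.

```python
from math import factorial


def moment_about_mode(k, lam2, lam3, lam4):
    """E{(X - lam1)^k} for k = 1, 2, ..., using the closed form above."""
    product = 1.0
    for i in range(1, k + 1):
        product *= lam3 * i + 1.0
    bracket = (1.0 - lam4) ** (k + 1) + (-1) ** k * lam4 ** (k + 1)
    return (lam2 * lam3) ** k * factorial(k) * bracket / product


def mean(lam1, lam2, lam3, lam4):
    """E{X} from the reduced formula."""
    return lam1 + lam2 * lam3 * (1.0 - 2.0 * lam4) / (lam3 + 1.0)


def variance(lam2, lam3, lam4):
    """V{X} from the reduced formula; it does not depend on lam1."""
    numerator = 2.0 * lam4 * (1.0 - lam4) * (lam3 - 1.0) + 1.0
    return (lam2 * lam3) ** 2 * numerator / ((2.0 * lam3 + 1.0) * (lam3 + 1.0) ** 2)


if __name__ == "__main__":
    m1 = moment_about_mode(1, 18.0, 0.2234, 0.278)
    m2 = moment_about_mode(2, 18.0, 0.2234, 0.278)
    print(7.0 + m1, mean(7.0, 18.0, 0.2234, 0.278))
    print(m2 - m1 * m1, variance(18.0, 0.2234, 0.278))
```

For the uniform special case (third parameter equal to one, unit scale) the variance formula reduces to 1/12 for any value of the fourth parameter, which is a convenient sanity check.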

3.  PARAMETER  DETERMINATION 

The four parameters of this family allow four independent
properties to be specified. These properties may be moments,
fractiles, upper and lower bounds, or mode when these properties
are known. Alternatively, the parameters may be estimated from sample
data using the classical techniques of maximum likelihood, least
squares, or method of moments, although these methods do not lead
to closed-form solutions.

3.1  Obtaining  Specified  Properties  When  No  Data  is  Available. 

Often a distribution is desired which possesses certain
properties. This happens when the only available data is from expert
opinion as to, say, the mode, the fractile of the mode and bounds on
the range of the variable. "No data" situations commonly occur in
PERT/CPM modeling where the activity has been performed seldom, if
ever, before. Whenever a new system is being designed, and therefore
no data is available, the use of specified properties to determine
parameter values is necessary. For the family discussed in this paper,
four independent properties may be specified, whereas for most commonly
used distributions only one or two properties can be obtained.

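Parameters for this family may be determined in a closed-form
manner from the interpretation of the four parameters. In particular,
letting m denote the mode, a the lower bound and b the upper bound,

$$
\begin{aligned}
\lambda_1 &= m \\
\lambda_4 &= \text{fractile of the mode } \lambda_1
\end{aligned}
\tag{3}
$$

$$
\lambda_2 = b - a \tag{4}
$$

and

$$
\lambda_3 =
\begin{cases}
\ln[1 + (x_1-\lambda_1)/(\lambda_2\lambda_4)] / \ln[p_1/\lambda_4] & \text{if } p_1 \le \lambda_4 \\
\ln[1 - (x_1-\lambda_1)/(\lambda_2(1-\lambda_4))] / \ln[(1-p_1)/(1-\lambda_4)] & \text{if } p_1 > \lambda_4
\end{cases}
\tag{5}
$$

yields the fractile $p_1$ at $x = x_1$.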


Other properties may be combined to yield parameter values.
In particular

$$
\lambda_1 - \lambda_2\lambda_4 = a, \tag{6}
$$

$$
\lambda_1 + \lambda_2(1-\lambda_4) = b,
$$

$$
\lambda_1 + \lambda_2\lambda_3(1-2\lambda_4)/(\lambda_3+1) = \text{mean}
$$

$$
(\lambda_2\lambda_3)^2 [2\lambda_4(1-\lambda_4)(\lambda_3-1) + 1] / [(2\lambda_3+1)(\lambda_3+1)^2] = \text{variance}
$$


3.2  Parameter  Estimation  from  Data. 

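When data $x_1, x_2, \ldots, x_n$ are available, the maximum
likelihood estimates may be used. Differentiating the likelihood
function with respect to each of the four parameters leads to

$$
\lambda_1:\quad \sum_{i=1}^{n_1} [\lambda_4 + z_i]^{-1} = \sum_{i=n_1+1}^{n} [1 - \lambda_4 - z_i]^{-1}
$$

$$
\lambda_2:\quad n\lambda_3/(\lambda_3 - 1) = \sum_{i=1}^{n_1} z_i/(\lambda_4 + z_i) - \sum_{i=n_1+1}^{n} z_i/(1 - \lambda_4 - z_i)
$$

$$
\lambda_3:\quad \lambda_3 = -\Big\{ \sum_{i=1}^{n_1} \ln[1 + z_i/\lambda_4] + \sum_{i=n_1+1}^{n} \ln[1 - z_i/(1-\lambda_4)] \Big\} / n
$$

$$
\lambda_4:\quad (\lambda_4 - 1)/\lambda_4 = \Big\{ \sum_{i=n_1+1}^{n} z_i/(1 - \lambda_4 - z_i) \Big\} \Big/ \sum_{i=1}^{n_1} z_i/(\lambda_4 + z_i)
$$

where $z_i = (x_i - \lambda_1)/\lambda_2$ and $n_1$ is the number of
$x_i$ values less than $\lambda_1$.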


Some care must be taken in using these maximum likelihood
equations, as with any family of distributions which includes points at
which the density function is infinite. Taken literally, the
equations always will lead to a U-shaped distribution
($\lambda_3 > 1$) with end points at $x_{(1)}$ and $x_{(n)}$ (the
first and nth order statistics), since the likelihood function is then
infinite. When the distribution is known not to be U shaped, the
estimates can be improved by specifying the lower bound a and the
upper bound b. Then $\lambda_2 = b-a$ and
$\lambda_1 = a + (b-a)\lambda_4$.

A modified maximum likelihood algorithm is then

1.  Set $\lambda_2 = b-a$, from equation (4).

2.  Solve for $\lambda_4$ in equation (8), keeping
    $\lambda_1 = a + (b-a)\lambda_4$ from equation (6) and
    $n_1 = [n\lambda_4]$. (Any standard unidimensional
    search technique, such as binary search,
    may be used.)

3.  Set $\lambda_3 = -\{ \sum_{i=1}^{n_1} \ln[1 + z_i/\lambda_4] + \sum_{i=n_1+1}^{n} \ln[1 - z_i/(1-\lambda_4)] \}/n$.
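
Steps 1 and 3 of this algorithm are direct computations and can be sketched as below. The sketch deliberately does not implement the search of step 2: it assumes a value for the fourth parameter (strictly between 0 and 1) has already been obtained, for example by the unidimensional search the text mentions. The function name `lambda3_step` is hypothetical.

```python
import math


def lambda3_step(data, a, b, lam4):
    """Steps 1 and 3 of the modified algorithm for a given lam4.

    Assumes 0 < lam4 < 1 and that every observation lies strictly
    between a and b; otherwise a logarithm of zero would be taken.
    """
    n = len(data)
    lam2 = b - a            # step 1, equation (4)
    lam1 = a + lam2 * lam4  # relation held fixed during step 2, equation (6)
    total = 0.0
    for x in data:
        z = (x - lam1) / lam2
        if x < lam1:
            total += math.log(1.0 + z / lam4)
        else:
            total += math.log(1.0 - z / (1.0 - lam4))
    lam3 = -total / n       # step 3
    return lam1, lam2, lam3
```

Because the bounds a and b are fixed in advance, no observation sits exactly at an end point of the support, which is what keeps the logarithms finite.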


4.  EXAMPLE 

In modeling a logic network, the propagation delay of each
component is of interest. Using the Texas Instruments TTL Data
Book [17], the minimum time, typical time and maximum time for a
positive-NAND gate are 2, 7 and 15 nanoseconds (ns), respectively.
In a particular situation, it is desired to have 1% of the gates
exceeding the stated maximum of 15 ns with the true maximum being
20 ns. This can be modeled using parameter values

$\lambda_1 = 7$ ns from equation (3),

$\lambda_2$ = range = 18 ns from equation (4),

$\lambda_4 = .278$ from equation (6),

and

$\lambda_3 = .2234$ from equation (5).

If simulation is to be performed, random variates may be generated
by substituting U(0,1) values p into equation (2)

$$
x =
\begin{cases}
7 - 18(.278)[1 - (p/.278)^{.2234}] & \text{if } p \le .278 \\
7 + 18(.722)[1 - ((1-p)/.722)^{.2234}] & \text{if } p > .278.
\end{cases}
$$
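
The parameter values of this example can be recomputed from the stated requirements (mode 7 ns, bounds 2 ns and 20 ns, and 99% of gates at or below 15 ns). The following sketch applies equations (3), (4), (6) and (5) in that order. The function name and argument names are illustrative assumptions.

```python
import math


def fit_from_mode_bounds_fractile(mode, lower, upper, x1, p1):
    """Closed-form parameters from a mode, two bounds and one fractile."""
    lam1 = mode                   # equation (3)
    lam2 = upper - lower          # equation (4)
    lam4 = (lam1 - lower) / lam2  # equation (6): lam1 - lam2 * lam4 = lower
    if p1 <= lam4:                # equation (5), first branch
        lam3 = math.log(1.0 + (x1 - lam1) / (lam2 * lam4)) / math.log(p1 / lam4)
    else:                         # equation (5), second branch
        lam3 = (math.log(1.0 - (x1 - lam1) / (lam2 * (1.0 - lam4)))
                / math.log((1.0 - p1) / (1.0 - lam4)))
    return lam1, lam2, lam3, lam4


lam1, lam2, lam3, lam4 = fit_from_mode_bounds_fractile(7.0, 2.0, 20.0, 15.0, 0.99)
print(lam1, lam2, lam3, lam4)
```

Carrying the fourth parameter at full precision (5/18) gives a third parameter of roughly .2233. Rounding the fourth parameter to .278 before applying equation (5) gives approximately the .2234 quoted above.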

5.  SUMMARY 

A four  parameter  family  has  been  developed  which  has  several 
appealing properties when used for probabilistic modeling. In
addition to the versatility in terms of the shapes which the distributions
assume,  the  density  function,  cdf,  inverse  cdf,  and  moments  are  all 
closed  form.  Parameter  determination  is  closed  form  in  many  situations 
where  specified  properties,  such  as  mode  and  fractiles,  are  required. 
Maximum  likelihood  estimation  is  considered  and  an  example  is  given. 